Entry Name:  "USF-Tuladhar-MC1"

VAST Challenge 2017
Mini-Challenge 1

 

 

Team Members:

Anwesh Tuladhar, University of South Florida, atuladhar@mail.usf.edu PRIMARY
Sulav Malla, University of South Florida, sulavmalla@mail.usf.edu
Ghulam Jilani Quadri, University of South Florida, ghulamjilani@mail.usf.edu

Dr. Paul Rosen, University of South Florida, Tampa FL, prosen@usf.edu 

Student Team:  YES

 

Tools Used:

Processing 3

Tableau

Apache Spark

Excel

 

Approximately how many hours were spent working on this submission in total?

80 Hours

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2017 is complete? YES

 

Video

Link to Youtube

 OR

http://eng.usf.edu/~sulavmalla/sulav_malla_files/vast2017/USF-Tuladhar-MC1-Video.mpg

OR

http://eng.usf.edu/~sulavmalla/sulav_malla_files/vast2017/USF-Tuladhar-MC1-Video.wmv

 

 

 

Questions

1 –– “Patterns of Life” analyses depend on recognizing repeating patterns of activities by individuals or groups. Describe up to six daily patterns of life by vehicles traveling through and within the park. Characterize the patterns by describing the kinds of vehicles participating, their spatial activities (where do they go?), their temporal activities (when does the pattern happen?), and provide a hypothesis of what the pattern represents (for example, if I drove to a coffee house every morning, but did not stay for long, you might hypothesize I’m getting coffee “to-go”). Please limit your answer to six images and 500 words.

 

For the “Patterns of Life” analyses, first we enrich the given data in two steps:

 

The map.

We developed a tool in Processing to represent the given map as a weighted graph. First, we scan the map to find all the sensor locations, which represent the nodes of the graph. Then we perform a modified depth-first search to find the paths between the nodes, which represent the edges. Each edge is weighted by the distance between its two nodes. Using this graph, we can plot any path on the map. We also export the graph data as a CSV file.
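The graph construction can be illustrated with a minimal Python sketch. It is not our Processing tool: the toy character grid (`TOY_MAP`), the function name `build_sensor_graph` and the use of road-cell counts as the distance are all assumptions made for illustration. Letters stand for sensors, `#` for road and `.` for off-road cells. The search starts at each sensor, follows road cells depth-first, and stops whenever it reaches another sensor.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]


def build_sensor_graph(grid: List[str]) -> Dict[Tuple[str, str], int]:
    """Return {(sensor_a, sensor_b): steps} for sensors joined by road
    without any other sensor in between (hypothetical toy representation)."""
    rows, cols = len(grid), len(grid[0])
    # Nodes: every cell holding a letter is treated as a sensor.
    sensors = {(r, c): grid[r][c]
               for r in range(rows) for c in range(cols)
               if grid[r][c].isalpha()}
    edges: Dict[Tuple[str, str], int] = {}

    def dfs(cell: Cell, start: str, steps: int, on_path: Set[Cell]) -> None:
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if nxt in on_path or grid[nr][nc] == ".":
                continue
            if nxt in sensors:
                # Reached a neighbouring sensor: record the edge and stop here.
                key = (start, sensors[nxt])
                edges[key] = min(edges.get(key, steps + 1), steps + 1)
                continue
            on_path.add(nxt)
            dfs(nxt, start, steps + 1, on_path)
            on_path.remove(nxt)  # backtrack so other routes can reuse the cell

    for cell, name in sensors.items():
        dfs(cell, name, 0, {cell})
    return edges


TOY_MAP = [
    "A##B",
    "#..#",
    "C###",
]
graph = build_sensor_graph(TOY_MAP)
```

Backtracking over simple paths keeps the sketch short but scales poorly; on a full-size map a visited set per start sensor would be the more practical choice.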

Sensor data.

We developed another tool in Spark to aggregate the sensor data and combine it with the graph data. We group the sensor data by “car-id” to trace the path followed by each car in a day; each resulting record represents one trip by one car. To avoid dangling trips when a trip spans two days, we apply the heuristic that a trip can end only at a camping site, the ranger-base or an entrance. We use the graph data to calculate the distance travelled, time taken and average speed for each trip. When multiple paths exist between two sensors, we choose the path for which the speed of travel is closest to the speed limit. We also record the start gate, end gate and day of the week for each trip. Because many trips follow the same path in the forward and reverse directions, we calculate the hash of the forward and reverse paths for easier grouping during visualization.
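The trip-building step can be sketched in plain Python (not Spark, to keep the example self-contained). The record layout `(timestamp, car_id, gate_name)`, the function names, and the choice of hashing the lexicographically smaller of the forward and reverse path strings are assumptions for illustration, not a description of our exact Spark job.

```python
import hashlib
from collections import defaultdict
from datetime import datetime

# Heuristic from the text: a trip may end only at one of these sensor kinds.
TRIP_END_PREFIXES = ("entrance", "camping", "ranger-base")


def split_trips(readings):
    """Group readings by car and cut each car's readings into trips.

    readings: iterable of (timestamp, car_id, gate_name) tuples (assumed layout).
    Returns {car_id: [[(timestamp, gate_name), ...], ...]}.
    """
    by_car = defaultdict(list)
    for ts, car_id, gate in readings:
        by_car[car_id].append((ts, gate))

    trips = {}
    for car_id, events in by_car.items():
        events.sort()  # chronological order
        car_trips, current = [], []
        for ts, gate in events:
            current.append((ts, gate))
            # Close the trip once it has moved on and reached a valid end point.
            if len(current) > 1 and gate.startswith(TRIP_END_PREFIXES):
                car_trips.append(current)
                current = []
        if current:
            car_trips.append(current)  # unfinished trip, kept as-is
        trips[car_id] = car_trips
    return trips


def average_speed_mph(distance_miles, start, end):
    """Average speed from graph distance and elapsed time."""
    seconds = (end - start).total_seconds()
    return distance_miles * 3600 / seconds


def path_key(gates):
    """Direction-independent hash: a path and its reverse share one key."""
    forward = ":".join(gates)
    reverse = ":".join(reversed(gates))
    return hashlib.md5(min(forward, reverse).encode()).hexdigest()
```

With this key, a trip entrance1:general-gate1:entrance3 and the return trip entrance3:general-gate1:entrance1 fall into the same group.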

 

Top 6 Patterns.

Using Tableau and the enriched data, we plot the most frequently used paths that enter and exit the preserve within a day. From Figure 1, we can see that the top 10 paths have a much higher count than the rest. Figure 1 also shows that every type of vehicle follows these paths.

 

 

Figure 1. Top 15 daily patterns through the preserve

 

The top 6 paths can be seen as a subway map in figure 2.

 

 

 

 

 

 

Figure 2. Top 6 paths as a subway map

 

In figure 3, we can see that the vehicles travel in both directions equally.

 

 

Figure 3. Top 6 paths separated by direction of travel

 

In figure 4, we look at the distribution of path usage over the days of the week. In figure 5, we see the path usage over 24 hours in 30-minute intervals. Together, these show that the paths are travelled regularly in both directions, throughout the week and at all hours of the day.
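The 30-minute binning was done in Tableau; the short Python sketch below only illustrates the idea. The function names and the "HH:MM" bin labels are assumptions.

```python
from collections import Counter
from datetime import datetime


def half_hour_bin(ts):
    """Label a timestamp with the start of its 30-minute interval, e.g. '14:30'."""
    return f"{ts.hour:02d}:{(ts.minute // 30) * 30:02d}"


def start_time_histogram(trip_starts):
    """Count trips per 30-minute interval, pooled over all days."""
    return Counter(half_hour_bin(ts) for ts in trip_starts)
```

A pattern that runs around the clock, like the through traffic here, produces non-empty bins across all 48 intervals.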

 

 

Figure 4. Top 6 paths by day of week

 

 

Figure 5. Start time distribution for top 6 paths.

 

In figure 6, we see that all vehicles have an average speed of 35-36 mph. Because this average is computed over the whole path, we can assume that the cars are not stopping anywhere on these paths.

 

 

Figure 6. Average speed of the vehicles following the top 6 paths

 

Conclusion.

From these figures, we conclude that the top daily patterns are through traffic: vehicles enter the preserve, cross it without stopping and exit on the same day.

 

 

 

 

 

2 –– Patterns of Life analyses may also depend on understanding what patterns appear over longer periods of time (in this case, over multiple days). Describe up to six patterns of life that occur over multiple days (including across the entire data set) by vehicles traveling through and within the park. Characterize the patterns by describing the kinds of vehicles participating, their spatial activities (where do they go?), their temporal activities (when does the pattern happen?), and provide a hypothesis of what the pattern represents (for example, many vehicles showing up at the same location each Saturday at the same time may suggest some activity occurring there each Saturday). Please limit your answer to six images and 500 words.

 

For patterns spanning multiple days, we further enrich the data from the single-day analysis by combining the trips of vehicles that have not yet exited the preserve. For each combined record we add the daily records, the end destination for each day spent in the preserve and the total number of days spent in the preserve.
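A minimal Python sketch of this combination step follows. It assumes each single-day trip is reduced to a `(start_ts, end_ts, end_gate)` tuple and that a stay is finished when a trip ends at an entrance; the function names and the output field names are hypothetical.

```python
from datetime import datetime


def summarize_stay(trips):
    """Summarize one stay from chronologically ordered (start_ts, end_ts, end_gate) trips."""
    first_start, last_end = trips[0][0], trips[-1][1]
    daily_end = {}
    for _, end_ts, end_gate in trips:
        # A later trip on the same calendar day overwrites an earlier one.
        daily_end[end_ts.date()] = end_gate
    return {
        "days_in_preserve": (last_end.date() - first_start.date()).days + 1,
        "daily_end": [daily_end[d] for d in sorted(daily_end)],
        "exited": trips[-1][2].startswith("entrance"),
    }


def combine_stays(trips):
    """Merge one car's single-day trips into stays that end at an entrance."""
    stays, current = [], []
    for trip in trips:
        current.append(trip)
        if trip[2].startswith("entrance"):
            stays.append(summarize_stay(current))
            current = []
    if current:
        stays.append(summarize_stay(current))  # car is still inside the preserve
    return stays
```

Counting both the entry day and the exit day is one possible convention for the total; an overnight stay then counts as 2 days.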

 

In figure 1, we see the top paths spanning multiple days. The top 6 patterns have a frequency of over 85. The top 5 paths have camping5 as the camping destination.

Figure 1. Top paths spanning multiple days

 

 

 

 

In figure 2, we show the subway map plots for the top 6 multiple day travel patterns.

 

 

 

Figure 2. Top 6 paths spanning multiple days as a subway map

 

Figure 1 suggests that camping5 is the most popular camping site, and we confirm this in figure 3, which shows the top camping destinations. We also see that campers have vehicles of type 1, 2 and 3. From figure 4, we see that camping5 is the most popular camp site throughout the year.

Figure 3. Popular camp sites

 

 

 

 

Figure 4. Popular camp sites by month

 

 

In figure 5, we look at the hours of the day when campers enter the preserve. We see that across all camp sites, vehicles start entering at 6 am and have entered the park by 5:30 pm. This suggests that campers require some sort of permit from the preserve, and that the office is open only from 6 am to 5:30 pm.

 

 

 

Figure 5. Start time distribution for camp sites

 

 

Camping1 seems to be the least popular camping site, but many cars (155 in total) still pass through camping1 and spend time there. Figure 6 shows details of these trips. This suggests that camping1 might be popular for daytime hiking.

 

 

 

 

Figure 6. a) All trips passing through camping1. b) Trips passing camping1 and finally camping elsewhere

 

 

 

3 –– Unusual patterns may be patterns of activity that changes from an established pattern, or are just difficult to explain from what you know of a situation. Describe up to six unusual patterns (either single day or multiple days) and highlight why you find them unusual. Please limit your answer to six images and 500 words.

 

Pattern 1.

In figure 1, we see that path 6 is the least popular path in May but the most popular path in July. This change is unusual because the path includes the section general-gate1:ranger-stop2:ranger-stop0:general-gate2. This section is special because the preserve is divided into two sides that are connected by only 2 sections. One is this section; the other (gate6:ranger-stop6:gate5) is not open to the general public.

 

 

 

Figure 1. Change in popularity of top single day path usage

 

 

Pattern 2.

 

In figure 2, we break down the top multiple-day paths from question 2 by month. We see that usage of the most popular path to camping5 drops suddenly from July to August. This may indicate that something changed in that area at that time.

 

 

Figure 2. Drop in multiple day path usage from July to August

 

 

Pattern 3.

 

In figure 3, we see that on July 10, 2015, six cars of type 1 travel the same path from entrance1 to ranger-stop1 at around the same time. This activity raises suspicion because none of the cars trigger the sensor at gate2, although gate2 must be passed when going from entrance1 to ranger-stop1. Also, access through gate2 is allowed only for park rangers with car type 2P.

Figure 3. Illegal car type 1.

 

Pattern 4.

 

In figure 4a, we plot all the paths that pass a “gate”. Only ranger vehicles may pass these gates, but we see that 23 cars of type 4 also pass them. In figure 4b, we see that these cars take this same path, always on Tuesdays and Thursdays. From figure 4c, we see that the path is travelled throughout the year and at around the same time, from 2 pm to 4:30 pm, which suggests some recurring unauthorized activity on this route on those days.
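The check behind this pattern can be sketched in Python as below. The record layout `(timestamp, car_id, car_type, gate_name)` and the function names are assumptions; the rule itself (sensors named "gate…" are reserved for ranger vehicles of type 2P) comes from the analysis above. We found the pattern visually in Tableau, so this is only an illustration of the same filter.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def restricted_gate_readings(readings, ranger_type="2P"):
    """Keep readings at ranger-only gates made by non-ranger vehicles.

    Names such as 'general-gate1' do not start with 'gate' and are not flagged.
    """
    return [r for r in readings
            if r[3].startswith("gate") and r[2] != ranger_type]


def weekday_counts(readings):
    """Count readings per day of the week."""
    return Counter(WEEKDAYS[r[0].weekday()] for r in readings)
```

Applying `weekday_counts` to the flagged readings gives the kind of day-of-week breakdown shown in figure 4b.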

 

Figure 4. a) All the paths which include a “gate”. Color indicates the car type. b) The day of week when cars of type 4 travel this path. c) The time of travel breakdown by months.

 

 

 

 

Pattern 5.

 

We found that a large number of people spend more than 10 days (up to 32 days) in the camp sites (Figure 5.a, 5.b). We also established in question 2 that camping1 is the least popular camping site. Even there, we see from figure 5.c that some people spend an extended amount of time (up to 13 days). Although extended camping is allowed in the preserve, we suspect these people might have a different agenda for staying for extended periods in the camp sites.

 

 

 

 

Figure 5. a) Number of people spending more than 10 days in the park camp sites. b) Breakdown by number of days spent in the park. c) Days spent in Camping1

 

 

Pattern 6.

 

We found a person who has spent 350 days in the park and still has not left. In figure 6, we see the number of days this person spent in each camp site, with the date each stay started shown on top. We find this highly unusual. Also, even after spending almost a year in the preserve, this person has yet to spend a day in camping1, which makes the statistics from figure 4 even more suspicious.

 

 

Figure 6. Days spent in each camp site by the person who spent almost a year in the park.

 

 

4 –– What are the top 3 patterns you discovered that you suspect could be most impactful to bird life in the nature preserve? (Short text answer)

The top 3 patterns that we suspect are causing the most impact on bird life in the preserve are:

a.     The daily through traffic involves all the entrances to the preserve and disturbs wildlife around the clock throughout the preserve. This is likely to have a significant impact on the bird life in the preserve.

b.     The unusual pattern that occurs on Tuesdays and Thursdays throughout the year is also highly suspicious. The activities of the vehicles involved might also be hurting bird life in the preserve.

c.     The campers staying for an extended number of days in the park might also be disturbing the natural habitat of the birds in those areas.